THERMAL TOLERANT AVICELASE FROM ACIDOTHERMUS CELLULOLYTICUS


Government Interests

The United States Government has rights in this invention under Contract No.
DE-AC36-99GO10337 between the United States Department of Energy and the National
Renewable Energy Laboratory, a division of the Midwest Research Institute.

Field of the Invention 

The invention generally relates to a novel avicelase from Acidothermus cellulolyticus, AviIII.
More specifically, the invention relates to purified and isolated AviIII polypeptides, nucleic acid
molecules encoding the polypeptides, and processes for production and use of AviIII, as well as
variants and derivatives thereof.

Background of the Invention 

Plant biomass as a source of energy production can include agricultural and forestry products,
associated by-products and waste, municipal solid waste, and industrial waste. In addition, over
50 million acres in the United States are currently available for biomass production, and there are
a number of terrestrial and aquatic crops grown solely as a source for biomass (A Wiselogel, et al.
Biomass feedstocks resources and composition, in CE Wyman, ed. Handbook on Bioethanol:
Production and Utilization. Washington, DC: Taylor & Francis, 1996, pp 105-118). Biofuels
produced from biomass include ethanol, methanol, biodiesel, and additives for reformulated
gasoline. Biofuels are desirable because they add little, if any, net carbon dioxide to the
atmosphere and because they greatly reduce ozone formation and carbon monoxide emissions as
compared to the environmental output of conventional fuels (P Bergeron. Environmental
impacts of bioethanol, in CE Wyman, ed. Handbook on Bioethanol: Production and Utilization.
Washington, DC: Taylor & Francis, 1996, pp 90-103).

Plant biomass is the most abundant source of carbohydrate in the world due to the lignocellulosic
materials composing the cell walls of all higher plants. Plant cell walls are divided into two
sections, the primary and the secondary cell walls. The primary cell wall, which provides
structure for expanding cells (and hence changes as the cell grows), is composed of three major
polysaccharides and one group of glycoproteins. The predominant polysaccharide, and most
abundant source of carbohydrates, is cellulose, while hemicellulose and pectin are also found in
abundance. Cellulose is a linear beta-(1,4)-D-glucan and comprises 20% to 30% of the primary
cell wall by weight. The secondary cell wall, which is produced after the cell has completed
growing, also contains polysaccharides and is strengthened through polymeric lignin covalently
cross-linked to hemicellulose.

Carbohydrates, and cellulose in particular, can be converted to sugars by well-known methods
including acid and enzymatic hydrolysis. Enzymatic hydrolysis of cellulose requires the processing
of biomass to reduce size and facilitate subsequent handling. Mild acid treatment is then used to
hydrolyze part or all of the hemicellulose content of the feedstock. Finally, cellulose is converted
to ethanol through the concerted action of cellulases and saccharolytic fermentation
(simultaneous saccharification fermentation (SSF)). The SSF process, using the yeast
Saccharomyces cerevisiae for example, is often incomplete, as it does not utilize the entire sugar
content of the plant biomass, namely the hemicellulose fraction.

The cost of producing ethanol from biomass can be divided into three areas of expenditure:
pretreatment costs, fermentation costs, and other costs. Pretreatment costs include biomass
milling, pretreatment reagents, equipment maintenance, power and water, and waste
neutralization and disposal. The fermentation costs can include enzymes, nutrient supplements,
yeast, maintenance and scale-up, and waste disposal. Other costs include biomass purchase,
transportation and storage, plant labor, plant utilities, ethanol distillation, and administration
(which may include technology-use licenses). One of the major expenses incurred in SSF is the
cost of the enzymes, as about one kilogram of cellulase is required to fully digest 50 kilograms of
cellulose. Economical production of cellulase is also complicated by factors such as the
relatively slow growth rates of cellulase-producing organisms, levels of cellulase expression, and
the tendency of enzyme-dependent processes to partially or completely inactivate enzymes due to
conditions such as elevated temperature, acidity, proteolytic degradation, and solvent
degradation.
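
The enzyme-loading figure quoted above lends itself to a quick back-of-the-envelope calculation. The following Python sketch is illustrative only: it assumes the stated ratio of about one kilogram of cellulase per 50 kilograms of cellulose scales linearly, and the function names and the 40% cellulose fraction used in the example are hypothetical values chosen for illustration, not figures from this disclosure.

```python
# Ratio stated in the text: about 1 kg of cellulase fully digests 50 kg of cellulose.
CELLULOSE_KG_PER_KG_CELLULASE = 50.0


def cellulose_in_biomass_kg(biomass_kg: float, cellulose_fraction: float) -> float:
    """Mass of cellulose in a feedstock, given its cellulose mass fraction (0 to 1)."""
    if biomass_kg < 0 or not 0.0 <= cellulose_fraction <= 1.0:
        raise ValueError("biomass must be non-negative and fraction within [0, 1]")
    return biomass_kg * cellulose_fraction


def cellulase_required_kg(cellulose_kg: float) -> float:
    """Cellulase needed for full digestion, assuming the ratio scales linearly."""
    if cellulose_kg < 0:
        raise ValueError("cellulose mass must be non-negative")
    return cellulose_kg / CELLULOSE_KG_PER_KG_CELLULASE


if __name__ == "__main__":
    # Hypothetical feedstock: 1000 kg of biomass at an assumed 40% cellulose.
    cellulose = cellulose_in_biomass_kg(1000.0, 0.40)
    print(f"cellulose: {cellulose:.0f} kg")
    print(f"cellulase: {cellulase_required_kg(cellulose):.1f} kg")
```

A calculation of this kind only bounds the enzyme requirement; actual loadings would depend on the pretreatment, the enzyme preparation, and the process conditions.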

Enzymatic degradation of cellulose requires the coordinate action of at least three different types
of cellulases. Such enzymes are given an Enzyme Commission (EC) designation according to the
Nomenclature Committee of the International Union of Biochemistry and Molecular Biology
(Eur. J. Biochem. 264: 607-609 and 610-650, 1999). Endo-beta-(1,4)-glucanases (EC 3.2.1.4)
cleave the cellulose strand randomly along its length, thus generating new chain ends.
Exo-beta-(1,4)-glucanases (EC 3.2.1.91) are processive enzymes and cleave cellobiosyl units
(beta-(1,4)-glucose dimers) from free ends of cellulose strands. Lastly, beta-D-glucosidases
(cellobiases: EC 3.2.1.21) hydrolyze cellobiose to glucose. All three of these general activities
are required for efficient and complete hydrolysis of a polymer such as cellulose to a subunit,
such as the simple sugar, glucose.

Highly thermostable enzymes have been isolated from the cellulolytic thermophile Acidothermus
cellulolyticus gen. nov., sp. nov., a bacterium originally isolated from decaying wood in an acidic,
thermal pool at Yellowstone National Park. A. Mohagheghi et al., (1986) Int. J. Systematic
Bacteriology, 36(3): 435-443. One cellulase enzyme produced by this organism, the endoglucanase
E1, is known to display maximal activity at 75°C to 83°C. M.P. Tucker et al. (1989),
Bio/Technology, 7(8): 817-820. E1 endoglucanase has been described in U.S. Patent 5,275,944.
The A. cellulolyticus E1 endoglucanase is an active cellulase; in combination with the
exocellulase CBH I from Trichoderma reesei, E1 gives a high level of saccharification and
contributes to a degree of synergism. Baker JO et al. (1994), Appl. Biochem. Biotechnol., 45/46:
245-256. The gene coding E1 catalytic and carbohydrate binding domains and linker peptide were
described in U.S. Patent 5,536,655. E1 has also been expressed as a stable, active enzyme from a
wide variety of hosts, including E. coli, Streptomyces lividans, Pichia pastoris, cotton, tobacco,
and Arabidopsis (Dai Z, Hooker BS, Anderson DB, Thomas SR. Transgenic Res. 2000 Feb;
9(1):43-54).

The potential exists for the successful, commercial-scale expression of heterologous cellulases,
and in particular novel cellulases with or without any one or more desirable properties such as
thermal tolerance and resistance to acid inactivation, proteolytic inactivation, and solvent
inactivation. Such expression can occur in filamentous fungi, bacteria, and other hosts.

There is a need within the art to generate alternative cellulase enzymes capable of assisting in the
commercial-scale processing of cellulose to sugar for use in biofuel production. Against this
backdrop the present invention has been developed. The potential exists for the successful,
commercial-scale expression of heterologous cellulase polypeptides, and in particular novel
cellulase polypeptides with or without any one or more desirable properties such as thermal
tolerance, and partial or complete resistance to extreme pH inactivation, proteolytic inactivation,
solvent inactivation, chaotropic agent inactivation, oxidizing agent inactivation, and detergent
inactivation. Such expression can occur in fungi, bacteria, and other hosts.

Summary of the Invention 

The present invention provides AviIII, a novel member of the glycoside hydrolase (GH) family of
enzymes, and in particular a thermal tolerant glycoside hydrolase useful in the degradation of
cellulose. AviIII polypeptides of the invention include those having an amino acid sequence
shown in SEQ ID NO:1, as well as polypeptides having substantial amino acid sequence identity
to the amino acid sequence of SEQ ID NO:1 and useful fragments thereof, including a catalytic
domain having significant sequence similarity to the GH74 family, a first carbohydrate binding
domain (type II) and a second carbohydrate binding domain (type III). See FIG. 1.

The invention also provides a polynucleotide molecule encoding AviIII polypeptides and
fragments of AviIII polypeptides, for example catalytic and carbohydrate binding domains.
Polynucleotide molecules of the invention include those molecules having a nucleic acid
sequence as shown in SEQ ID NO:2; those that hybridize to the nucleic acid sequence of SEQ ID
NO:2 under high stringency conditions; and those having substantial nucleic acid identity with
the nucleic acid sequence of SEQ ID NO:2.

The invention includes variants and derivatives of the AviIII polypeptides, including fusion
proteins. For example, fusion proteins of the invention include AviIII polypeptide fused to a
heterologous protein or peptide that confers a desired function. The heterologous protein or
peptide can facilitate purification, oligomerization, stabilization, or secretion of the AviIII
polypeptide, for example. As further examples, the heterologous polypeptide can provide
enhanced activity, including catalytic or binding activity, for AviIII polypeptides, where the
enhancement is either additive or synergistic. A fusion protein of an embodiment of the
invention can be produced, for example, from an expression construct containing a
polynucleotide molecule encoding AviIII polypeptide in frame with a polynucleotide molecule
for the heterologous protein. Embodiments of the invention also comprise vectors, plasmids,
expression systems, host cells, and the like, containing an AviIII polynucleotide molecule. Genetic
engineering methods for the production of AviIII polypeptides of embodiments of the invention
include expression of a polynucleotide molecule in cell free expression systems and in cellular
hosts, according to known methods.

The invention further includes compositions containing a substantially purified AviIII
polypeptide of the invention and a carrier. Such compositions are administered to a biomass
containing cellulose for the reduction or degradation of the cellulose.

The invention also provides reagents, compositions, and methods that are useful for analysis of
AviIII activity.

These and various other features as well as advantages which characterize the present invention
will be apparent from a reading of the following detailed description and a review of the
associated drawings.

The following Tables 1 and 2 include sequences used in describing embodiments of the present
invention. In Table 1, the abbreviations are as follows: CD, catalytic domain; CBD_II,
carbohydrate binding domain type II; CBD_III, carbohydrate binding domain type III; and FN-III,
fibronectin domain type III. When used herein, N* indicates a string of unknown nucleic acid
units, and X* indicates a string of unknown amino acid units, for example about 50 or more.
Table 1 includes approximate start and stop information for segments, and Table 2 includes
amino acid sequence data for segments.

Brief Description of the Drawings

FIG. 1 is a schematic representation of the gene sequence and amino acid segment organization.
FIG. 2 is a graphic representation of the glycoside hydrolase gene/protein families found in
various organisms.

Detailed Description 

Definitions:

The following definitions are provided to facilitate understanding of certain terms used frequently
herein and are not meant to limit the scope of the present disclosure:

"Amino acid" refers to any of the twenty naturally occurring amino acids as well as any modified
amino acid sequences. Modifications may include natural processes such as posttranslational
processing, or may include chemical modifications which are known in the art. Modifications
include but are not limited to: phosphorylation, ubiquitination, acetylation, amidation,
glycosylation, covalent attachment of flavin, ADP-ribosylation, cross linking, iodination,
methylation, and the like.

"Antibody" refers to a Y-shaped molecule having a pair of antigen binding sites, a hinge region
and a constant region. Fragments of antibodies, for example an antigen binding fragment (Fab),
chimeric antibodies, antibodies having a human constant region coupled to a murine antigen
binding region, and fragments thereof, as well as other well known recombinant antibodies are
included in the present invention.

"Antisense" refers to polynucleotide sequences that are complementary to target "sense"
polynucleotide sequence.

"Binding activity" refers to any activity that can be assayed by characterizing the ability of a
polypeptide to bind to a substrate. The substrate can be a polymer such as cellulose or can be a
complex molecule or aggregate of molecules where the entire moiety comprises at least some
cellulose.

Attorney Docket No. NREL 01-36

"Cellulase activity" refers to any activity that can be assayed by characterizing the enzymatic
activity of a cellulase. For example, cellulase activity can be assayed by determining how much
reducing sugar is produced during a fixed amount of time for a set amount of enzyme (see Irwin
et al., (1998) J. Bacteriology, 1709-1714). Other assays are well known in the art and can be
substituted.

"Complementary" or "complementarity" refers to the ability of a polynucleotide in a
polynucleotide molecule to form a base pair with another polynucleotide in a second
polynucleotide molecule. For example, the sequence A-G-T is complementary to the sequence
T-C-A. Complementarity may be partial, in which only some of the polynucleotides match
according to base pairing, or complete, where all the polynucleotides match according to base
pairing.
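
The base-pairing rule in this definition can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions: only the four DNA bases are handled, the two sequences are compared position by position in the order written (as in the A-G-T / T-C-A example, without modelling strand orientation), and the function names are hypothetical.

```python
# Watson-Crick pairing for DNA bases.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Return the base-by-base complement of a DNA sequence."""
    return "".join(PAIRS[base] for base in seq.upper())


def complementarity(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned positions that form a base pair (1.0 = complete)."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(PAIRS.get(a) == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return paired / len(seq_a)


if __name__ == "__main__":
    print(complement("AGT"))              # complement of the example in the text
    print(complementarity("AGT", "TCA"))  # complete complementarity
    print(complementarity("AGT", "TCG"))  # partial complementarity
```

A returned value of 1.0 corresponds to complete complementarity as defined above, and any value between 0 and 1 to partial complementarity.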

"Expression" refers to transcription and translation occurring within a host cell. The level of 
expression of a DNA molecule in a host cell may be determined on the basis of either the amount
of corresponding mRNA that is present within the cell or the amount of DNA molecule encoded
protein produced by the host cell (Sambrook et al., 1989, Molecular cloning: A Laboratory
Manual, 18.1-18.88).

"Fusion protein" refers to a first protein having attached a second, heterologous protein.
Preferably, the heterologous protein is fused via recombinant DNA techniques, such that the first
and second proteins are expressed in frame. The heterologous protein can confer a desired
characteristic to the fusion protein, for example, a detection signal, enhanced stability or
stabilization of the protein, facilitated oligomerization of the protein, or facilitated purification of
the fusion protein. Examples of heterologous proteins useful in the fusion proteins of the
invention include molecules having one or more catalytic domains of AviIII, one or more binding
domains of AviIII, one or more catalytic domains of a glycoside hydrolase other than AviIII, one
or more binding domains of a glycoside hydrolase other than AviIII, or any combination thereof.
Further examples include immunoglobulin molecules and portions thereof, peptide tags such as
histidine tag (6-His), leucine zipper, substrate targeting moieties, signal peptides, and the like.
Fusion proteins are also meant to encompass variants and derivatives of AviIII polypeptides that
are generated by conventional site-directed mutagenesis and more modern techniques such as
directed evolution, discussed infra.

"Genetically engineered" refers to any recombinant DNA or RNA method used to create a
prokaryotic or eukaryotic host cell that expresses a protein at elevated levels, at lowered levels, or
in a mutated form. In other words, the host cell has been transfected, transformed, or transduced
with a recombinant polynucleotide molecule, and thereby been altered so as to cause the cell to
alter expression of the desired protein. Methods and vectors for genetically engineering host cells
are well known in the art; for example various techniques are illustrated in Current Protocols in
Molecular Biology, Ausubel et al., eds. (Wiley & Sons, New York, 1988, and quarterly updates).
Genetically engineering techniques include but are not limited to expression vectors, targeted
homologous recombination and gene activation (see, for example, U.S. Patent No. 5,272,071 to
Chappel) and trans activation by engineered transcription factors (see, for example, Segal et al.,
1999, Proc Natl Acad Sci USA 96(6):2758-63).

"Glycoside hydrolase family" refers to a family of enzymes which hydrolyze the glycosidic bond
between two or more carbohydrates or between a carbohydrate and a non-carbohydrate moiety
(Henrissat B., (1991) Biochem. J. 280:309-316). Identification of a putative glycoside hydrolase
family member is made based on an amino acid sequence comparison and the finding of
significant sequence similarity within the putative member's catalytic domain, as compared to the
catalytic domains of known family members.

"Homology" refers to a degree of complementarity between polynucleotides, having significant
effect on the efficiency and strength of hybridization between polynucleotide molecules. The
term also can refer to a degree of similarity between polypeptides.

"Host cell" or "host cells" refers to cells expressing a heterologous polynucleotide molecule.
Host cells of the present invention express polynucleotides encoding AviIII or a fragment thereof.
Examples of suitable host cells useful in the present invention include, but are not limited to,
prokaryotic and eukaryotic cells. Specific examples of such cells include bacteria of the genera
Escherichia, Bacillus, and Salmonella, as well as members of the genera Pseudomonas,
Streptomyces, and Staphylococcus; fungi, particularly filamentous fungi such as Trichoderma and
Aspergillus, Phanerochaete chrysosporium and other white rot fungi; also other fungi including
fusaria, molds, and yeast including Saccharomyces sp., Pichia sp., and Candida sp. and the like;
plants e.g. Arabidopsis, cotton, barley, tobacco, potato, and aquatic plants and the like; SF9 insect
cells (Summers and Smith, 1987, Texas Agriculture Experiment Station Bulletin, 1555), and the
like. Other specific examples include mammalian cells such as human embryonic kidney cells
(293 cells), Chinese hamster ovary (CHO) cells (Puck et al., 1958, Proc Natl Acad Sci USA 60,
1275-1281), human cervical carcinoma cells (HELA) (ATCC CCL 2), human liver cells (Hep
G2) (ATCC HB8065), human breast cancer cells (MCF-7) (ATCC HTB22), human colon
carcinoma cells (DLD-1) (ATCC CCL 221), Daudi cells (ATCC CRL-213), murine myeloma
cells such as P3/NSI/1-Ag4-1 (ATCC TIB-18), P3X63Ag8 (ATCC TIB-9), SP2/0-Ag14 (ATCC
CRL-1581) and the like.

"Hybridization" refers to the pairing of complementary polynucleotides during an annealing
period. The strength of hybridization between two polynucleotide molecules is impacted by the
homology between the two molecules, stringency of the conditions involved, the melting
temperature of the formed hybrid and the G:C ratio within the polynucleotides.

"Identity" refers to a comparison, between pairs of nucleic acid or amino acid molecules. 
Methods for determining sequence identity are known. See, for example, computer programs 
commonly employed for this purpose, such as the Gap program (Wisconsin Sequence Analysis
Package, Version 8 for Unix, Genetics Computer Group, University Research Park, Madison
Wisconsin), that uses the algorithm of Smith and Waterman, 1981, Adv. Appl. Math., 2: 482-489.
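
For orientation, the following Python sketch shows one simple way a percent identity figure can be derived once two sequences have already been aligned. It does not reproduce the Gap program or its alignment step; the function name, the use of "-" as the gap symbol, and the choice to count every alignment column except those where both sequences have a gap are assumptions made for illustration, and other conventions for the denominator exist.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of compared alignment columns in which both residues are identical.

    Both inputs must come from the same pairwise alignment, so they have equal
    length. Columns where both sequences carry a gap are skipped; a column with
    a gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    print(percent_identity("ACGT", "ACGT"))
    print(percent_identity("ACGT", "ACGA"))
    print(percent_identity("AC-T", "ACGT"))
```

The same function applies to aligned amino acid sequences, since it only compares characters column by column.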

"Isolated" refers to a polynucleotide or polypeptide that has been separated from at least one
contaminant (polynucleotide or polypeptide) with which it is normally associated. For example,
an isolated polynucleotide or polypeptide is in a context or in a form that is different from that in
which it is found in nature.

"Nucleic acid sequence" refers to the order or sequence of deoxyribonucleotides along a strand of
deoxyribonucleic acid. The order of these deoxyribonucleotides determines the order of amino
acids along a polypeptide chain. The deoxyribonucleotide sequence thus codes for the amino
acid sequence.

"Polynucleotide" refers to a linear sequence of nucleotides. The nucleotides may be
ribonucleotides, or deoxyribonucleotides, or a mixture of both. Examples of polynucleotides in
the context of the present invention include single and double stranded DNA, single and double
stranded RNA, and hybrid molecules having mixtures of single and double stranded DNA and
RNA. The polynucleotides of the present invention may contain one or more modified
nucleotides.

"Protein," "peptide," and "polypeptide" are used interchangeably to denote an amino acid polymer
or a set of two or more interacting or bound amino acid polymers.

"Purify," or "purified" refers to a target protein that is free from at least 5-10% of contaminating
proteins. Purification of a protein from contaminating proteins can be accomplished using known
techniques, including ammonium sulfate or ethanol precipitation, acid precipitation, heat
precipitation, anion or cation exchange chromatography, phosphocellulose chromatography,
hydrophobic interaction chromatography, affinity chromatography, hydroxylapatite
chromatography, size-exclusion chromatography, and lectin chromatography. Various protein
purification techniques are illustrated in Current Protocols in Molecular Biology, Ausubel et al.,
eds. (Wiley & Sons, New York, 1988, and quarterly updates).

"Selectable marker" refers to a marker that identifies a cell as having undergone a recombinant
DNA or RNA event. Selectable markers include, for example, genes that encode antimetabolite
resistance such as the DHFR protein that confers resistance to methotrexate (Wigler et al., 1980,
Proc Natl Acad Sci USA 77:3567; O'Hare et al., 1981, Proc Natl Acad Sci USA, 78:1527), the
GPT protein that confers resistance to mycophenolic acid (Mulligan & Berg, 1981, PNAS USA,
78:2072), the neomycin resistance marker that confers resistance to the aminoglycoside G-418
(Colberre-Garapin et al., 1981, J Mol Biol, 150:1), the Hygro protein that confers resistance to
hygromycin (Santerre et al., 1984, Gene 30:147), and the Zeocin™ resistance marker
(Invitrogen). In addition, the herpes simplex virus thymidine kinase, hypoxanthine-guanine
phosphoribosyltransferase and adenine phosphoribosyltransferase genes can be employed in tk-,
hgprt- and aprt- cells, respectively.

"Stringency" refers to the conditions (temperature, ionic strength, solvents, etc.) under which
hybridization between polynucleotides occurs. A hybridization reaction conducted under high
stringency conditions is one that will only occur between polynucleotide molecules that have a
high degree of complementary base pairing (85% to 100% identity). Conditions for high
stringency hybridization, for example, may include an overnight incubation at about 42°C for about
2.5 hours in 6 X SSC/0.1% SDS, followed by washing of the filters in 1.0 X SSC at 65°C, 0.1%
SDS. A hybridization reaction conducted under moderate stringency conditions is one that will
occur between polynucleotide molecules that have an intermediate degree of complementary base
pairing (50% to 84% identity).
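
The identity ranges given in this definition can be expressed as a small lookup. The Python sketch below is an illustration only: the function name and return labels are hypothetical, and because the text states the ranges as 85% to 100% and 50% to 84%, the handling of fractional values between 84% and 85% (treated here as moderate) is an assumption.

```python
def stringency_for_identity(identity_percent: float) -> str:
    """Map a percent identity to the stringency class under which hybridization
    is expected to occur, using the ranges stated in the definition above."""
    if not 0.0 <= identity_percent <= 100.0:
        raise ValueError("identity must be between 0 and 100 percent")
    if identity_percent >= 85.0:
        return "high"            # 85% to 100% identity
    if identity_percent >= 50.0:
        return "moderate"        # 50% to 84% identity (84-85% assumed moderate)
    return "below moderate"      # not covered by the definition


if __name__ == "__main__":
    for value in (92.0, 70.0, 30.0):
        print(value, stringency_for_identity(value))
```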

"Substrate targeting moiety" refers to any signal on a substrate, either naturally occurring or
genetically engineered, used to target any AviIII polypeptide or fragment thereof to a substrate.
Such targeting moieties include ligands that bind to a substrate structure. Examples of
ligand/receptor pairs include carbohydrate binding domains and cellulose. Many such
substrate-specific ligands are known and are useful in the present invention to target an AviIII
polypeptide or fragment thereof to a substrate. A novel example is an AviIII carbohydrate binding
domain that is used to tether other molecules to a cellulose-containing substrate such as a fabric.

"Thermal tolerant" refers to the property of withstanding partial or complete inactivation by heat 
and can also be described as thermal resistance or thermal stability. Although some variation
exists in the literature, the following definitions can be considered typical for the optimum
temperature range of stability and activity for enzymes: psychrophilic (below freezing to 10°C);
mesophilic (10°C to 50°C); thermophilic (50°C to 75°C); and caldophilic (75°C to above boiling
water temperature). The stability and catalytic activity of enzymes are linked characteristics, and
the ways of measuring these properties vary considerably. For industrial enzymes, stability and
activity are best measured under use conditions, often in the presence of substrate. Therefore,
cellulases that must act on process streams of cellulose must be able to withstand exposure up to
thermophilic or even caldophilic temperatures for digestion times in excess of several hours.
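
The temperature classes listed above can be captured in a few lines of Python. This is a sketch under stated assumptions: the function name is hypothetical, and since adjacent ranges in the text share their boundary values (10°C, 50°C, and 75°C), assigning each boundary to the warmer class is an arbitrary choice made here for illustration.

```python
def thermal_class(optimum_temp_c: float) -> str:
    """Classify an enzyme by its optimum temperature, in degrees Celsius."""
    if optimum_temp_c < 10.0:
        return "psychrophilic"   # below freezing to 10 C
    if optimum_temp_c < 50.0:
        return "mesophilic"      # 10 C to 50 C
    if optimum_temp_c < 75.0:
        return "thermophilic"    # 50 C to 75 C
    return "caldophilic"         # 75 C to above boiling water temperature


if __name__ == "__main__":
    for temp in (5.0, 37.0, 60.0, 80.0):
        print(temp, thermal_class(temp))
```

Under this scheme, an enzyme with maximal activity in the range of 75°C to 83°C would be labelled caldophilic.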

In encompassing a wide variety of potential applications for embodiments of the present
invention, thermal tolerance refers to the ability to function in a temperature range of from about
15°C to about 100°C. A preferred range is from about 30°C to about 80°C. A highly preferred
range is from about 50°C to about 70°C. For example, a protein that can function at about 45°C
is considered in the preferred range even though it may be susceptible to partial or complete
inactivation at temperatures in a range above about 45°C and less than about 80°C. For
polypeptides derived from organisms such as Acidothermus, the desirable property of thermal
tolerance is often accompanied by other desirable characteristics such as: resistance to
extreme pH degradation, resistance to solvent degradation, resistance to proteolytic degradation,
resistance to detergent degradation, resistance to oxidizing agent degradation, resistance to
chaotropic agent degradation and resistance to general degradation. Cowan DA in Danson MJ et
al. (1992) The Archaebacteria, Biochemistry and Biotechnology at 149-159, University Press,
Cambridge, ISBN 1855780100. Here 'resistance' is intended to include any partial or complete
level of residual activity. When a polypeptide is described as thermal tolerant it is understood
that any one, more than one, or none of these other desirable properties can be present.

"Variant", as used herein, means a polynucleotide or polypeptide molecule that differs from a
reference molecule. Variants can include nucleotide changes that result in amino acid
substitutions, deletions, fusions, or truncations in the resulting variant polypeptide when
compared to the reference polypeptide.

"Vector," "extra-chromosomal vector" or "expression vector" refers to a first polynucleotide
molecule, usually double-stranded, which may have inserted into it a second polynucleotide
molecule, for example a foreign or heterologous polynucleotide. The heterologous
polynucleotide molecule may or may not be naturally found in the host cell, and may be, for
example, one or more additional copy of the heterologous polynucleotide naturally present in the
host genome. The vector is adapted for transporting the foreign polynucleotide molecule into a
suitable host cell. Once in the host cell, the vector may be capable of integrating into the host cell
chromosomes. The vector may optionally contain additional elements for selecting cells
containing the integrated polynucleotide molecule as well as elements to promote transcription of
mRNA from transfected DNA. Examples of vectors useful in the methods of the present
invention include, but are not limited to, plasmids, bacteriophages, cosmids, retroviruses, and
artificial chromosomes.

Within the application, unless otherwise stated, the techniques utilized may be found in any of
several well-known references, such as: Molecular Cloning: A Laboratory Manual (Sambrook et
al. (1989) Molecular cloning: A Laboratory Manual), Gene Expression Technology (Methods in
Enzymology, Vol. 185, edited by D. Goeddel, 1991 Academic Press, San Diego, CA), "Guide to
Protein Purification" in Methods in Enzymology (M.P. Deutscher, ed., (1990) Academic Press,
Inc.), PCR Protocols: A Guide to Methods and Applications (Innis et al. (1990) Academic Press,
San Diego, CA), Culture of Animal Cells: A Manual of Basic Technique, 2nd ed. (R.I. Freshney
(1987) Liss, Inc., New York, NY), and Gene Transfer and Expression Protocols, pp 109-128, ed.
E.J. Murray, The Humana Press Inc., Clifton, NJ).



O-Glycoside Hydrolases:

Glycoside hydrolases are a large and diverse family of enzymes that hydrolyse the glycosidic
bond between two carbohydrate moieties or between a carbohydrate and a non-carbohydrate
moiety (See FIG. 2). Glycoside hydrolase enzymes are classified into glycoside hydrolase (GH)
families based on significant amino acid similarities within their catalytic domains. Enzymes
having related catalytic domains are grouped together within a family (Henrissat et al., (1991)
supra, and Henrissat et al. (1996), Biochem. J. 316:695-696), where the underlying classification
provides a direct relationship between the GH domain amino acid sequence and how a GH
domain will fold. This information ultimately provides a common mechanism for how the
enzyme will hydrolyse the glycosidic bond within a substrate, i.e., either by a retaining
mechanism or inverting mechanism (Henrissat, B., (1991) supra).

Cellulases belong to the GH family of enzymes. Cellulases are produced by a variety of bacteria
and fungi to degrade the β-1,4 glycosidic bond of cellulose and to so produce successively
smaller fragments of cellulose and ultimately produce glucose. At present, cellulases are found
within at least 11 different GH families. Three different types of cellulase enzyme activities
have been identified within these GH families: exo-acting cellulases which cleave successive
disaccharide units from the non-reducing ends of a cellulose chain; endo-acting cellulases which
randomly cleave successive disaccharide units within the cellulose chain; and β-glucosidases
which cleave successive disaccharide units to glucose (J. W. Deacon, (1997) Modern Mycology,
3rd Ed., ISBN: 0-632-03077-1, 97-98).

Many cellulases are characterized by having a multiple domain unit within their overall structure:
a GH or catalytic domain is joined to a carbohydrate-binding domain (CBD) by a glycosylated
linker peptide (Koivula et al., (1996) Protein Expression and Purification 8:391-400). As noted
above, cellulases do not belong to any one family of GH domains, but rather have been identified
within at least 11 different GH families to date. The CBD type domain increases the
concentration of the enzyme on the substrate, in this case cellulose, and the linker peptide
provides flexibility for both larger domains.

Conversion of cellulose to glucose is an essential step in the production of ethanol or other
biofuels from biomass. Cellulases are an important component of this process, where
approximately one kilogram of cellulase can digest fifty kilograms of cellulose. Within this
process, thermostable cellulases have taken precedence, due to their ability to function at elevated
temperatures and under other conditions including pH extremes, solvent presence, detergent
presence, proteolysis, etc. (see Cowan DA (1992), supra).

Highly thermostable cellulase enzymes are secreted by the cellulolytic thermophile Acidothermus
cellulolyticus (U.S. Patent Nos. 5,275,944 and 5,110,735). This bacterium was originally
isolated from decaying wood in an acidic, thermal pool at Yellowstone National Park and
deposited with the American Type Culture Collection (ATCC 43068) (Mohagheghi et al., (1986)
Int. J. System. Bacteriol. 36:435-443).



Recently, a thermostable cellulase, E1 endoglucanase, was identified and characterized from
Acidothermus cellulolyticus (U.S. Patent No. 5,536,655). The E1 endoglucanase has maximal
activity between 75 and 83°C and is active to a pH well below 5. Thermostable cellulases, and E1
endoglucanase, are useful in the conversion of biomass to biofuels, and in particular, are useful in
the conversion of cellulose to glucose. Conversion of biomass to biofuel represents an extremely
important alternative fuel source that is more environmentally friendly than conventional fuels,
and provides a use, in some cases, for waste products.



AviIII:

As described more fully in the Examples below, AviIII, a novel thermostable cellulase, has now
been identified and characterized. The predicted amino acid sequence of AviIII (SEQ ID NO:1)
has an organization characteristic of a cellulase enzyme. AviIII contains a carbohydrate binding
domain - linker domain - catalytic domain - linker domain - fibronectin domain - linker domain -
carbohydrate binding domain unit. In particular, AviIII includes a carbohydrate binding domain
type III (CBDIII) (amino acids from about A35 to about A187), a GH74 catalytic domain (amino
acids from about N231 to about P870), and a CBDII (amino acids from about G1021 to about
S1121).

As discussed in more detail below (Example 2), significant amino acid similarity of AviIII to
other cellulases identifies AviIII as a cellulase. In addition, the predicted amino acid sequence
(SEQ ID NO: 1) indicates that a CBD type III domain is present as characterized by Tomme P. et
al. (1995), in Enzymatic Degradation of Insoluble Polysaccharides (Saddler JN & Penner M,
eds.), at 142-163, American Chemical Society, Washington. See also Tomme, P. & Claeyssens,
M. (1989) FEBS Lett. 243, 239-243; Gilkes, N.R. et al., (1988) J. Biol. Chem. 263, 10401-10407.



AviIII, as noted above, has a catalytic domain, identified as belonging to the GH74 family. The
GH74 domain family includes a number of exoglucanases, for example, from Cellulomonas fimi,
and exoglucanase E3 isolated from Thermobifida fusca. The GH74 members degrade substrate
using an inverting mechanism. Being a member of the GH74 family of proteins identifies AviIII
as potentially having cellulase activity.



AviIII is also a thermostable cellulase as it is produced by the thermophile Acidothermus
cellulolyticus. As discussed, AviIII polypeptides can have other desirable characteristics (see
Cowan DA (1992), supra). Like other members of the cellulase family, and in particular
thermostable cellulases, AviIII polypeptides are useful in the conversion of biomass to biofuels
and biofuel additives, and in particular, biofuels from cellulose. It is envisioned that AviIII
polypeptides could be used for other purposes, for example in detergents, pulp and paper
processing, food and feed processing, and in textile processes. AviIII polypeptides can be used
alone or in combination with one or more other cellulases or glycoside hydrolases to perform the
uses described herein or known within the relevant art, all of which are within the scope of the
present disclosure.

AviIII Polypeptides:

AviIII polypeptides of the invention include isolated polypeptides having an amino acid sequence
as shown below in Example 1, Table 3, and in SEQ ID NO:1, as well as variants and derivatives,
including fragments, having substantial identity to the amino acid sequence of SEQ ID NO: 1 and
that retain any of the functional activities of AviIII. AviIII polypeptide activity can be determined,
for example, by subjecting the variant, derivative, or fragment to a substrate binding assay or a
cellulase activity assay such as those described in Irwin D et al., J. Bacteriology 180(7): 1709-1714
(April 1998).

Table 3. AviIII amino acid sequence (SEQ ID NO: 1)

MDRSENXRLTMRSRRLVSLLJUVTASPAljAAALGVLPIAITASPAHAATTO 
P YTWSNVAI GGGGFVDG I VFNEGAPG I } ^ YWTDl GGM YRWDAANGRW I PL 
LDWVGWNNWGYNGWS I AADf»TNTNKVy| AAVGMYTNSWDPNOGAI LRSSD 
15 QGATWQITPLPFKLCC^pGRGMGERU|.VDPNNDNILYF<^P5GFtGl.WRS 

iVQSDIQGWWVAFDKSSSSLGQ 



TDSGATW S QMTNFPDVGTY I AN PTDTT* 
ASKTIFVGVAOPNNPVPWSRDGGATWQ? .VPGAPTQFl PHKjGVFDPVNHVL 
YIATSNTGGPYDGSSGnVWKPSVTSGTl 'TRI S PVPSTDTANPYFGYSGLT 

idrqhpntimvatqi swwpdti i frstj 
20 ld i sae pwlt fgvqpn p pvps pklgwm1 
atndltkwdsggqihiapmwgleh'Ta:. 

THADVTAV PSTI FTSPVFTTGTSVPYAJ 
VAFSTDGGKttWPQGSE PGGVTTGGTVA,- 



GFGNSWAASQGVPAKAQl^DRVNPKT^rYAIiSNGTFYRSTDGGVTFOPVA 



25 AGLPSSGAVGVMFHAVPGKEGDLWX^ 
VNVGFGKSAPGSSYPAVFWGTIGGVTl 
NWGQAI TGDHANLRR WI GTNGRGI VYl'ID IGGAPSGSPS PS VS PSAS PSL 
SPSPSPSSSPSPSPSPSSSPSSSPS^Sil'SPSPSPSRSPSPSASPSPSSSP 



iGGATWTRI WDWTSYPNRSLRYV 
'EAKAI DPFNSDRMCYGTGATLY 
"MDLISPPSGAPLISALGDLGGF 
:LN PS X I VRAGS FDPS SQPNDRH 
.SADGSRFVWAPGOPGQPWYAV 



SGX.YHSTNGGSSWSAITGVSSA 
iAYRSDDCGTTWVLIlTDDQHQYG 



SPSSSPSSSPSPTPSSSPVSGGVKVQYI 
30 SSVDLSTVTVRYWFTRPGGSSTLVYWC? 
ADTYLQX* 



mDS APGDrtQI KPGLQWNTGS 
J WAAI GCGNI RAS FGS VNPATPT 



As listed and described in Tables 3 and 2, the isolated AviIII polypeptide includes an N-terminal
hydrophobic region that functions as a signal peptide, having an amino acid sequence that begins
with Metl and extends to about A|*4; a carbohydrate binding domain having sequence similarity to 
such type III domains that begins with about A35 and extends to about A187; a catalytic domain
having significant sequence similarity to a GH74 family domain that begins with about N231 and
extends to about P870; a fibronectin type III domain that begins with about D901 and extends to
about G985; and a carbohydrate binding domain type II region that begins with about G1021 and
extends to about S1121. Variants and derivatives of AviIII include, for example, AviIII
polypeptides modified by covalent or aggregative conjugation with other chemical moieties, such as
glycosyl groups, polyethylene glycol (PEG) groups, lipids, phosphate, acetyl groups, and the like.
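
The approximate domain boundaries given above can be captured in a small lookup table. The following is only an illustrative sketch: the residue ranges are the "about" values stated in the text, the signal peptide is left out, and the names `AVIIII_DOMAINS` and `domain_at` are hypothetical helpers introduced here, not part of the disclosure.

```python
# Approximate AviIII domain boundaries as stated in the text (1-based, inclusive).
# These are "about" values; the names below are illustrative assumptions.
AVIIII_DOMAINS = [
    ("CBD III", 35, 187),
    ("GH74 catalytic", 231, 870),
    ("fibronectin type III", 901, 985),
    ("CBD II", 1021, 1121),
]


def domain_at(position):
    """Return the name of the domain containing a residue position, or None.

    None means the position falls outside the listed domains, for example
    in one of the linker regions between them.
    """
    for name, start, end in AVIIII_DOMAINS:
        if start <= position <= end:
            return name
    return None
```

A position such as 200, which lies between the CBD III and GH74 ranges, returns `None`, which is one simple way to flag residues lying in a linker.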

The amino acid sequence of AviIII polypeptides of the invention is preferably at least about 60%
identical, more preferably at least about 70% identical, or in some embodiments at least about 90%
identical, to the AviIII amino acid sequence shown above in Table 3 and SEQ ID NO: 1. The
percentage identity, also termed homology (see definition above), can be readily determined, for
example, by comparing the two polypeptide sequences using any of the computer programs
commonly employed for this purpose, such as the Gap program (Wisconsin Sequence Analysis
Package, Version 8 for Unix, Genetics Computer Group, University Research Park, Madison
Wisconsin), which uses the algorithm of Smith and Waterman, 1981, Adv. Appl. Math. 2: 482-489.
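
As a rough illustration of how a percentage identity can be derived from a Smith-Waterman style local alignment, the sketch below scores two sequences and counts identical positions along the best local alignment. It is not the Gap program: the scoring values (`match`, `mismatch`, `gap`), the linear gap penalty, and the function name are assumptions chosen for brevity, and production tools use substitution matrices and affine gap penalties.

```python
def local_identity(a, b, match=2, mismatch=-1, gap=-2):
    """Percent identity over the best Smith-Waterman local alignment of a and b.

    Simplified sketch: flat match/mismatch scores and a linear gap penalty.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the dynamic-programming matrix; local alignment floors scores at 0.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until the score drops to 0.
    i, j = best_pos
    identical = aligned = 0
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            identical += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        aligned += 1

    return 100.0 * identical / aligned if aligned else 0.0
```

Note that identity computed over a local alignment can overstate similarity between full-length sequences, since only the best-matching region is counted; a threshold such as "at least about 60% identical" would normally be evaluated with the alignment settings of the chosen program.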



Variants and derivatives of the AviIII polypeptide may further include, for example, fusion
proteins formed of an AviIII polypeptide and a heterologous polypeptide. Preferred heterologous
polypeptides include those that facilitate purification, oligomerization, stability, or secretion of
the AviIII polypeptides.



AviIII polypeptide variants and derivatives, as used in the description of the invention, can contain
conservatively substituted amino acids, meaning that one or more amino acids can be replaced by an
amino acid that does not alter the secondary and/or tertiary structure of the polypeptide. Such
substitutions can include the replacement of an amino acid by a residue having similar
physicochemical properties, such as substituting one aliphatic residue (Ile, Val, Leu, or Ala) for
another, or substitutions between basic residues Lys and Arg, acidic residues Glu and Asp, amide
residues Gln and Asn, hydroxyl residues Ser and Tyr, or aromatic residues Phe and Tyr.
Phenotypically silent amino acid exchanges are described more fully in Bowie et al., 1990, Science
247:1306-1310. In addition, functional AviIII polypeptide variants include those having amino acid
substitutions, deletions, or additions to the amino acid sequence outside functional regions of the
protein, for example, outside the catalytic and carbohydrate binding domains. These would include,
for example, the various linker sequences that connect functional domains as defined herein.
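
The residue groupings listed above lend themselves to a simple membership check. The sketch below encodes exactly the groups named in the text (one-letter codes) and reports whether a proposed substitution stays within one of them. The names `CONSERVATIVE_GROUPS` and `is_conservative` are illustrative assumptions, and a real assessment would also consider structural context.

```python
# Residue groups as listed in the text, in one-letter amino acid codes.
CONSERVATIVE_GROUPS = [
    {"I", "V", "L", "A"},  # aliphatic: Ile, Val, Leu, Ala
    {"K", "R"},            # basic: Lys, Arg
    {"E", "D"},            # acidic: Glu, Asp
    {"Q", "N"},            # amide: Gln, Asn
    {"S", "Y"},            # hydroxyl: Ser, Tyr
    {"F", "Y"},            # aromatic: Phe, Tyr
]


def is_conservative(original, replacement):
    """True if two different residues share at least one group listed above."""
    if original == replacement:
        return False  # no substitution took place
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS
    )
```

For instance, Lys to Arg is reported as conservative, while Lys to Glu is not, because the two residues do not share any listed group.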

The AviIII polypeptides of the present invention are preferably provided in an isolated form, and
preferably are substantially purified. The polypeptides may be recovered and purified from
recombinant cell cultures by known methods, including, for example, ammonium sulfate or ethanol
precipitation, anion or cation exchange chromatography, phosphocellulose chromatography,
hydrophobic interaction chromatography, affinity chromatography, hydroxylapatite
chromatography, and lectin chromatography. Preferably, high performance liquid chromatography
(HPLC) is employed for purification.

Another preferred form of AviIII polypeptides is that of recombinant polypeptides as expressed by
suitable hosts. Furthermore, the hosts can simultaneously produce other cellulases such that a
mixture is produced comprising an AviIII polypeptide and one or more other cellulases. Such a
mixture can be effective in crude fermentation processing or other industrial processing.

AviIII polypeptides can be fused to heterologous polypeptides to facilitate purification. Many
available heterologous peptides (peptide tags) allow selective binding of the fusion protein to a
binding partner. Non-limiting examples of peptide tags include 6-His, thioredoxin, hemagglutinin,
GST, and the OmpA signal sequence tag. A binding partner that recognizes and binds to the
heterologous peptide can be any molecule or compound, including metal ions (for example, metal
affinity columns), antibodies, antibody fragments, or any protein or peptide that preferentially binds
the heterologous peptide to permit purification of the fusion protein.

AviIII polypeptides can be modified to facilitate formation of AviIII oligomers. For example, AviIII
polypeptides can be fused to peptide moieties that promote oligomerization, such as leucine zippers
and certain antibody fragment polypeptides, for example, Fc polypeptides. Techniques for
preparing these fusion proteins are known, and are described, for example, in WO 99/31241 and in
Cosman et al., 2001 Immunity 14:123-133. Fusion to an Fc polypeptide offers the additional
advantage of facilitating purification by affinity chromatography over Protein A or Protein G
columns. Fusion to a leucine-zipper (LZ), for example, a repetitive heptad repeat, often with four or
five leucine residues interspersed with other amino acids, is described in Landschulz et al., 1988,
Science, 240:1759.

It is also envisioned that an expanded set of variants and derivatives of AviIII polynucleotides
and/or polypeptides can be generated to select for useful molecules, where such expansion is
achieved not only by conventional methods such as site-directed mutagenesis (SDM) but also by
more modern techniques, either independently or in combination.

Site-directed mutagenesis is considered an informational approach to protein engineering and can
rely on high-resolution crystallographic structures of target proteins and some stratagem for specific
amino acid changes (Van Den Burg, B.; Vriend, G.; Veltman, O.R.; Venema, G.; Eijsink, V.G.H.
Proc. Nat. Acad. Sci. U.S. 1998, 95, 2056-2060). For example, modification of the amino acid
sequence of AviIII polypeptides can be accomplished as is known in the art, such as by introducing
mutations at particular locations by oligonucleotide-directed mutagenesis (Walder et al., 1986,
Gene, 42:133; Bauer et al., 1985, Gene 37:73; Craik, 1985, BioTechniques, 12-19; Smith et al.,
1981, Genetic Engineering: Principles and Methods, Plenum Press; and U.S. Patent No. 4,518,584
and U.S. Patent No. 4,737,462). SDM technology can also employ the recent advent of
computational methods for identifying site-specific changes for a variety of protein engineering
objectives (Hellinga, H.W. Nature Structural Biol. 1998, 5, 525-527).

The more modern techniques include, but are not limited to, non-informational mutagenesis
techniques (referred to generically as "directed evolution"). Directed evolution, in conjunction with
high-throughput screening, allows testing of statistically meaningful variations in protein
conformation (Arnold, F.H. Nature Biotechnol. 1998, 16, 617-618). Directed evolution technology
can include diversification methods similar to that described by Crameri A. et al. (1998, Nature 391:
288-291), site-saturation mutagenesis, staggered extension process (StEP) (Zhao, H.; Giver, L.;
Shao, Z.; Affholter, J.A.; Arnold, F.H. Nature Biotechnol. 1998, 16, 258-262), and DNA
synthesis/reassembly (U.S. Patent 5,965,408).
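
To make the idea of site-saturation concrete at the sequence level, the sketch below enumerates, in silico, every single-residue replacement at one chosen position of a protein sequence. It only illustrates the size and shape of the resulting variant set; it says nothing about how the library is constructed in the laboratory, and the function name and mutation labels (for example `D2A`) are assumptions made here.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def site_saturation_variants(sequence, position):
    """Map mutation labels to variant sequences for one 1-based position.

    Each of the 19 alternative standard residues is substituted in turn;
    the wild-type residue itself is skipped.
    """
    index = position - 1
    if not 0 <= index < len(sequence):
        raise ValueError("position is outside the sequence")
    wild_type = sequence[index]
    return {
        f"{wild_type}{position}{aa}": sequence[:index] + aa + sequence[index + 1:]
        for aa in AMINO_ACIDS
        if aa != wild_type
    }
```

Applied to several positions at once, the number of combinations grows multiplicatively, which is one reason such libraries are paired with high-throughput screening.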

Fragments of the AviIII polypeptide can be used, for example, to generate specific anti-AviIII
antibodies. Using known selection techniques, specific epitopes can be selected and used to
generate monoclonal or polyclonal antibodies. Such antibodies have utility in the assay of AviIII
activity as well as in purifying recombinant AviIII polypeptides from genetically engineered host
cells.

AviIII Polynucleotides:

The invention also provides polynucleotide molecules encoding the AviIII polypeptides discussed
above. AviIII polynucleotide molecules of the invention include polynucleotide molecules
having the nucleic acid sequence shown in Table 4 and SEQ ID NO: 2; polynucleotide molecules
that hybridize to the nucleic acid sequence of Table 4 and SEQ ID NO:2 under high stringency
hybridization conditions (for example, 42°C, 2.5 hr, 6X SSC, 0.1% SDS); and polynucleotide
molecules having substantial nucleic acid sequence identity with the nucleic acid sequence of
Table 4 and SEQ ID NO:2, particularly with those nucleic acids encoding the catalytic domain,
GH74 (from about amino acid A:|7 to about G776), the carbohydrate binding domain III (from 

about amino acid V859 to about at least Q946).

Table 4. AviIII nucleotide sequence (SEQ ID NO: 2)



ATGGATCGTTCGGAGAACATCCGTCTG? 
1 0 TCGCCGTGGCCGCCGCTCTGGGAGTTC- 
GTACACCTGGAGCAACGTGGCGATCGG< 
ATTCTGTACGTGCGGACGGACATCGGGC 
ATTGGGTGGGATGGAACAATTGGGCGT^ 
ATGGGCCGCCGTCGGAATGTACACCAAl 



CTATGAGATCACGACGATTGGTATCACTGCTCGCCGCCACTGCGTCGT 

gcccatcgcgataacggcttctcctgcgcacgcggcgacgactcagcc 
ggcggcggctttgtcgacgggatcgtcttcaatgaaggtgcaccggga 
ggatgtatcgatgggatgccgccaacmgcggtggatccctcttctgg 
caacggcgtcgtcagcattgcggcagacccgatcaatactaacaaggt 
agctgggacccaaacgacggagcgattctccgctcgtctgatcagggc 



1 5 gcaacgtggcaaataacgcccctgccg' itcaagcttggcggcaacatgcccgggc^tggaatgggogagcggc^g 



cggtggatccaaacaatgacaacattc: 

CGGCGCGACCTGGTCCCAGATGACGAAi. 

tatcagagcgatattcaaggcgtogtc' 



; t^TATTTCGGCGCCCCGAGCGGCAAAGGGCTCTGGAGAAGCACAGATTC 
!. TITCCGGACGTAGGCACGTACATTGCAAATCCCACTGAOVCGACCGGC 
'GGGTCGCTTTCGACAAGTCTTCGTCATCGCTCGGGCAAGCGAGTAAGA 
CCATTTTTGTGGGCGTGGCGGATCCXA^TAATCC^TC^ 
20 GCCGGGTGCGCCGAOCGGCTTCATCCG ]f CACAAGGGCGTCTTTGACCCGGTCAACCACGTGCTCTATATTGCCAC C 
AGCAATACGGGTGG?CeGTATGACGGG;|>GCTCCGGCGA<^CT^ 
GAATCAGCCCGGTACCTTCGACGGACA^GGCCAACGACTACT^ 

cccgaacacgataatggtggcaaccca(|&tatcgtggtggccggacacc^ 
gcgacgtggacgcggatctgggattgg^gagttatcccaat^ 
25 agccttggctgaccttcggcgtacagc^gaatc 

ggc^tcgatccgttcaactctgatcgf|jatgctcta^ 

aagtgggactccggcggccagattcat-utcgcgccgatggtcaaaggattggaggagacggcgct 
tcagcccgccgtctggcgccccgctca pcagcgctctcggagacctcggcggcttcacccacgccgacgttactgc 

CGTGCCATCGAOGATCTTCACGTCACalJGTGTTCAO^^ 

30 at(^tcgttcgosctggaagtttcgat<|:catcga 
gcaagaactggttccaaggcagcgaac(]:tggcgggg 

tcgtttcgtctgggctcccggcgatco!:ggtcagcctgtggtgtacgcagtcggatt^ 



TCGCAAGGTGTTCCCGCCAATGCCCAG. 
GAACCTTCTATCGAAGCACGGACGGCG 

35 CGGTGTCATGTtCCACGCGGTGCCTGG, 
ACCAATGGCGGCAGCA6TTGGTCTGCA 
CCGGGTCGTCATACCCAGCCGTCTTTG' 
TGGGACGAC CTGGGTACTGATCAATGA' 
GCGAATTTACGGCGGGTGTACATAGGC, 

40 GATCGCCGTCTCCGTCGGTGAGTCCGTH 
GCOGTCGCCGTCGCCGAGCTCGAGTCC 



tccgctcagaccgggtgaatccaaagactttctatgccctatccaatg 
k:gtgagattccaaccggtcgcggccggtcttccgagcagcggtgccgt 

iAAAGAAGGCGATCTGTGGCTOGCTGOATCGAGCGGGCTTTACCACTCA 
vTCACCGGCGTATCCTCCGCGGTGAACGTGGGATTTGGTAAGTCTGCGC 
rCGTCGGCACGATCGGAGGCGTTACGGGGGCGTACCGCTCCGACGACTG 
^CCAGC^CCAATACGGAAATTGGGGACAAGCAATCACCGGTGACCAC 
tCGAACGGCCGTGGAATTGTATACGGGGACATTGGTGGTGCGCCGTCCG 
?GGCTTCGCraAGCCTGAGCCCGAGCCGGAGCCCGAGCAGCTCGCCATC 
^TCCTCGTCGCCGTCTCCGTCGCCGTCACCATCGCCGAGTCCGTCTCGG 



T(?rCMTCACCATCGGCGTCGCCGAGOj:CGT^^ 

CAACGCCGTCGTCGTCGCCGGTGTCGGjJTGGGGTGAAGGTGCAGTATAAGAATAATG 
TCAGATCAAGCCGGGTTTGCAGGTGGT^IsAATACCGGGTCGTCGTCGGTGGATTTGTCGACGGTGACGG 
45 TGGTTtoeeeGGGATGGTGGCTCGTCG|*CACTGGTO^ 

GCGCCTCGTTCGGCTCGGTGAACCCGQCGACGCCGACGGCGG^CACCTACCTGCAGN* 



The AviIII polynucleotide molecules of the invention are preferably isolated molecules encoding the
AviIII polypeptide having an amino acid sequence as shown in Table 3 and SEQ ID NO: 1, as well as
derivatives, variants, and useful fragments of the AviIII polynucleotide. The AviIII polynucleotide
sequence can include deletions, substitutions, or additions to the nucleic acid sequence of Table 4
and SEQ ID NO: 2.

The AviIII polynucleotide molecule of the invention can be cDNA, chemically synthesized DNA,
DNA amplified by PCR, RNA, or combinations thereof. Due to the degeneracy of the genetic
code, two DNA sequences may differ and yet encode identical amino acid sequences. The present
invention thus provides an isolated polynucleotide molecule having an AviIII nucleic acid sequence
encoding AviIII polypeptide, where the nucleic acid sequence encodes a polypeptide having the
complete amino acid sequence as shown in Table 3 and SEQ ID NO: 1, or variants, derivatives,
and fragments thereof.
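
The degeneracy point can be shown with a short translation sketch using the standard genetic code. The two input sequences in the usage note are made-up illustrations (the first follows the opening codons shown in Table 4, the second is a hypothetical synonymous recoding), and the helper name `translate` is an assumption introduced here.

```python
from itertools import product

# Standard genetic code, built in T, C, A, G order for each codon position.
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence (reading frame 1) into one-letter codes.

    Stop codons are rendered as '*'. Trailing bases that do not form a full
    codon are ignored.
    """
    dna = dna.upper()
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - len(dna) % 3, 3)
    )
```

For example, `translate("ATGGATCGTTCGGAGAAC")` and `translate("ATGGACCGCAGCGAAAAT")` both give `MDRSEN`, although the two DNA strings differ at five positions.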

The AviIII polynucleotides of the invention have a nucleic acid sequence that is at least about 60%
identical to the nucleic acid sequence shown in Table 4 and SEQ ID NO: 2, in some embodiments
at least about 70% identical to the nucleic acid sequence shown in Table 4 and SEQ ID NO: 2, and
in other embodiments at least about 90% identical to the nucleic acid sequence shown in Table 4
and SEQ ID NO: 2. Nucleic acid sequence identity is determined by known methods, for example
by aligning two sequences in a software program such as the BLAST program (Altschul, S.F. et al.
(1990) J. Mol. Biol. 215:403-410, from the National Center for Biotechnology Information
(http://www.ncbi.nlm.nih.gov/BLAST/)).

The AviIII polynucleotide molecules of the invention also include isolated polynucleotide
molecules having a nucleic acid sequence that hybridizes under high stringency conditions (as
defined above) to the nucleic acid sequence shown in Table 4 and SEQ ID NO: 2. Hybridization

of the polynucleotide is to about li| contiguous nucleotides, or about 20 contiguous nucleotides, and 
in other embodiments about 30 contiguous nucleotides, and in still other embodiments about 100
contiguous nucleotides of the nucleic acid sequence shown in Table 4 and SEQ ID NO: 2.

Useful fragments of the AviIII-encoding polynucleotide molecules described herein include probes
and primers. Such probes and primers can be used, for example, in PCR methods to amplify and
detect the presence of AviIII polynucleotides in vitro, as well as in Southern and Northern blots for
analysis of AviIII. Cells expressing the AviIII polynucleotide molecules of the invention can also
be identified by the use of such probes. Methods for the production and use of such primers and
probes are known. For PCR, 5' and 3' primers corresponding to a region at the termini of the AviIII
polynucleotide molecule can be employed to isolate and amplify the AviIII polynucleotide using
conventional techniques.
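
As a simple illustration of primers "corresponding to a region at the termini", the sketch below takes the first bases of a coding strand as the 5' (forward) primer and the reverse complement of the last bases as the 3' (reverse) primer. It is a bare sequence manipulation: the default length of 20 and the function names are assumptions, and practical primer design would also weigh melting temperature, GC content, and secondary structure.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence."""
    return sequence.translate(_COMPLEMENT)[::-1]


def terminal_primers(template, length=20):
    """Return (forward, reverse) primers taken from the two ends of a template.

    The forward primer matches the 5' end of the given strand; the reverse
    primer is the reverse complement of its 3' end, so both read 5' to 3'.
    """
    if len(template) < 2 * length:
        raise ValueError("template is too short for two primers of this length")
    forward = template[:length]
    reverse = reverse_complement(template[-length:])
    return forward, reverse
```

With the short illustrative template `ATGGATCGTTCGGAGAAC` and `length=6`, the forward primer is `ATGGAT` and the reverse primer is `GTTCTC`.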



Other useful fragments of the AviIII polynucleotides include antisense or sense oligonucleotides
comprising a single-stranded nucleic acid sequence capable of binding to a target AviIII mRNA
(using a sense strand), or DNA (using an antisense strand) sequence.

Vectors and Host Cells:

The present invention also provides vectors containing the polynucleotide molecules of the
invention, as well as host cells transformed with such vectors. Any of the polynucleotide molecules
of the invention may be contained in a vector, which generally includes a selectable marker and an
origin of replication, for propagation in a host. The vectors further include suitable transcriptional
or translational regulatory sequences, such as those derived from mammalian, microbial, viral, or
insect genes, operably linked to the AviIII polynucleotide molecule. Examples of such regulatory
sequences include transcriptional promoters, operators, or enhancers, mRNA ribosomal binding
sites, and appropriate sequences which control transcription and translation. Nucleotide sequences
are operably linked when the regulatory sequence functionally relates to the DNA encoding the
target protein. Thus, a promoter nucleotide sequence is operably linked to an AviIII DNA sequence
if the promoter nucleotide sequence directs the transcription of the AviIII sequence.



Selection of suitable vectors for the cloning of AviIII polynucleotide molecules encoding the target
AviIII polypeptides of this invention will depend upon the host cell in which the vector will be
transformed, and, where applicable, the host cell from which the target polypeptide is to be
expressed. Suitable host cells for expression of AviIII polypeptides include prokaryotes, yeast, and
higher eukaryotic cells, each of which is discussed below.

The AviIII polypeptides to be expressed in such host cells may also be fusion proteins that include
regions from heterologous proteins. As discussed above, such regions may be included to allow, for
example, secretion, improved stability, or facilitated purification of the AviIII polypeptide. For
example, a nucleic acid sequence encoding an appropriate signal peptide can be incorporated into an
expression vector. A nucleic acid sequence encoding a signal peptide (secretory leader) may be
fused in-frame to the AviIII sequence so that AviIII is translated as a fusion protein comprising the
signal peptide. A signal peptide that is functional in the intended host cell promotes extracellular
secretion of the AviIII polypeptide. Preferably, the signal sequence will be cleaved from the AviIII
polypeptide upon secretion of AviIII from the cell. Non-limiting examples of signal sequences that
can be used in practicing the invention include the yeast α-factor and the honeybee melittin leader in
Sf9 insect cells.

Suitable host cells for expression of target polypeptides of the invention include prokaryotes, yeast,
and higher eukaryotic cells. Suitable prokaryotic hosts to be used for the expression of these
polypeptides include bacteria of the genera Escherichia, Bacillus, and Salmonella, as well as
members of the genera Pseudomonas, Streptomyces, and Staphylococcus. For expression in
prokaryotic cells, for example, in E. coli, the polynucleotide molecule encoding AviIII polypeptide
preferably includes an N-terminal methionine residue to facilitate expression of the recombinant
polypeptide. The N-terminal Met may optionally be cleaved from the expressed polypeptide.

Expression vectors for use in prokaryotic hosts generally comprise one or more phenotypic
selectable marker genes. Such genes encode, for example, a protein that confers antibiotic
resistance or that supplies an auxotrophic requirement. A wide variety of such vectors are readily
available from commercial sources. Examples include pSPORT vectors, pGEM vectors (Promega,
Madison, WI), pPROEX vectors (LTI, Bethesda, MD), Bluescript vectors (Stratagene), and pQE
vectors (Qiagen).

AviIII can also be expressed in yeast host cells from genera including Saccharomyces, Pichia, and
Kluyveromyces. Preferred yeast hosts are S. cerevisiae and P. pastoris. Yeast vectors will often
contain an origin of replication sequence from a 2μ yeast plasmid, an autonomously replicating
sequence (ARS), a promoter region, sequences for polyadenylation, sequences for transcription
termination, and a selectable marker gene. Vectors replicable in both yeast and E. coli (termed
shuttle vectors) may also be used. In addition to the above-mentioned features of yeast vectors, a
shuttle vector will also include sequences for replication and selection in E. coli. Direct secretion of
the target polypeptides expressed in yeast hosts may be accomplished by the inclusion of nucleotide
sequence encoding the yeast α-factor leader sequence at the 5' end of the AviIII-encoding nucleotide
sequence.

Insect host cell culture systems can also be used for the expression of AviIII polypeptides. The
target polypeptides of the invention are preferably expressed using a baculovirus expression system,
as described, for example, in the review by Luckow and Summers, 1988 Bio/Technology 6:47.

The choice of a suitable expression vector for expression of AviIII polypeptides of the invention
will depend upon the host cell to be used. Examples of suitable expression vectors for E. coli
include pET, pUC, and similar vectors as is known in the art. Preferred vectors for expression of
the AviIII polypeptides include the shuttle plasmid pIJ702 for Streptomyces lividans,
pGAPZalpha-A, B, C and pPICZalpha-A, B, C (Invitrogen) for Pichia pastoris, and pFE-1 and pFE-2
for filamentous fungi and similar vectors as is known in the art.

Modification of an AviIII polynucleotide molecule to facilitate insertion into a particular vector
(for example, by modifying restriction sites), ease of use in a particular expression system or host
(for example, using preferred host codons), and the like, are known and are contemplated for use
in the invention. Genetic engineering methods for the production of AviIII polypeptides include
the expression of the polynucleotide molecules in cell free expression systems, in cellular hosts,
in tissues, and in animal models, according to known methods.

Compositions

The invention provides compositions containing a substantially purified AviIII polypeptide of the
invention and an acceptable carrier. Such compositions are administered to biomass, for
example, to degrade the cellulose in the biomass into simpler carbohydrate units and ultimately,
to sugars. These released sugars from the cellulose are converted into ethanol by any number of
different catalysts. Such compositions may also be included in detergents for removal, for
example, of cellulose-containing stains within fabrics, or compositions used in the pulp and paper
industry, to address conditions associated with cellulose content. Compositions of the present
invention can be used in stonewashing jeans such as is well known in the art. Compositions can
be used in the biopolishing of cellulosic fabrics, such as cotton, linen, rayon and Lyocell.

The invention provides pharmaceutical compositions containing a substantially purified AviIII
polypeptide of the invention and, if necessary, a pharmaceutically acceptable carrier. Such
pharmaceutical compositions are administered to cells, tissues, or patients, for example, to aid in
delivery or targeting of other pharmaceutical compositions. For example, AviIII polypeptides
may be used where carbohydrate-mediated liposomal interactions are involved with target cells.
Vyas SP et al. (2001), J. Pharmacy & Pharmaceutical Sciences May-Aug 4(2): 138-58.

The invention also provides reagents, compositions, and methods that are useful for analysis of
AviIII activity and for the analysis of cellulose breakdown.

Compositions of the present invention may also include other known cellulases, and preferably,
other known thermal tolerant cellulases for enhanced treatment of cellulose.

Antibodies 

The polypeptides of the present invention, in whole or in part, may be used to raise polyclonal and
monoclonal antibodies that are useful in purifying AviIII, or detecting AviIII polypeptide
expression, as well as a reagent tool for characterizing the molecular actions of the AviIII
polypeptide. Preferably, a peptide containing a unique epitope of the AviIII polypeptide is used in
preparation of antibodies, using conventional techniques. Methods for the selection of peptide
epitopes and production of antibodies are known. See, for example, Antibodies: A Laboratory
Manual, Harlow and Lane (eds.), 1988 Cold Spring Harbor Laboratory Press, Cold Spring Harbor,
N.Y.; Monoclonal Antibodies, Hybridomas: A New Dimension in Biological Analyses, Kennett et
al. (eds.), 1980 Plenum Press, New York.

Assays

Agents that modify, for example increase or decrease, AviIII hydrolysis or degradation of
cellulose can be identified, for example, by assay of AviIII cellulase activity and/or analysis of
AviIII binding to a cellulose substrate. Incubation of cellulose in the presence of AviIII and in the
presence or absence of a test agent and correlation of cellulase activity or carbohydrate binding
permits screening of such agents. For example, cellulase activity and binding assays may be
performed in a manner similar to those described in Irwin et al., J. Bacteriology 180(7): 1709-1714
(April 1998).


i 
i 
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I 

i 

The AviIII stimulated activity is determined in the presence and absence of a test agent and then
compared. A lower AviIII activated test activity in the presence of the test agent, than in the
absence of the test agent, indicates that the test agent has decreased the activity of the AviIII. A
higher AviIII activated test activity in the presence of the test agent than in the absence of the test
agent indicates that the test agent has increased the activity of the AviIII. Stimulators and
inhibitors of AviIII may be used to augment, inhibit, or modify AviIII mediated activity, and
therefore may have potential industrial uses as well as potential use in the further elucidation of
AviIII's molecular actions.
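
The comparison described above can be sketched in a few lines of Python. This is a minimal
illustration only: the function name, the replicate readings, and the 5% tolerance used to call
"no effect" are assumptions made for the example and are not taken from the specification.

```python
from statistics import mean


def classify_agent(activity_without, activity_with, tolerance=0.05):
    """Compare mean AviIII activity measured without and with a test agent.

    `tolerance` is a hypothetical relative threshold: changes smaller than
    this fraction of the baseline are reported as "no effect".
    Returns a (label, relative_change) tuple.
    """
    baseline = mean(activity_without)
    treated = mean(activity_with)
    if baseline <= 0:
        raise ValueError("baseline activity must be positive")
    change = (treated - baseline) / baseline
    if change > tolerance:
        return "stimulator", change
    if change < -tolerance:
        return "inhibitor", change
    return "no effect", change


# Hypothetical replicate readings in arbitrary activity units.
label, change = classify_agent([1.00, 1.04, 0.96], [0.52, 0.48, 0.50])
print(label, round(change, 2))
```

A real screen would also need replicate statistics and controls; the sketch only encodes the
lower-versus-higher decision stated in the text.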

Therapeutic Applications

The AviIII polypeptides of the invention are effective in aiding in delivery or targeting of other
pharmaceutical compositions within a host. For example, AviIII polypeptides may be used where
carbohydrate-mediated liposomal interactions are involved with target cells. Vyas SP et al.
(2001), J Pharm Pharm Sci May-Aug 4(2): 138-58.

AviIII polynucleotides and polypeptides, including vectors expressing AviIII, of the invention can
be formulated as pharmaceutical compositions and administered to a host, preferably a mammalian
host, including a human patient, in a variety of forms adapted to the chosen route of
administration. The compounds are preferably administered in combination with a
pharmaceutically acceptable carrier, and may be combined with or conjugated to specific delivery
agents, including targeting antibodies and/or cytokines.

AviIII can be administered by known techniques, such as orally, parenterally (including
subcutaneous injection, intravenous, intramuscular, intrasternal or infusion techniques), by
inhalation spray, topically, by absorption through a mucous membrane, or rectally, in dosage unit
formulations containing conventional non-toxic pharmaceutically acceptable carriers, adjuvants
or vehicles. Pharmaceutical compositions of the invention can be in the form of suspensions or
tablets suitable for oral administration, nasal sprays, creams, sterile injectable preparations, such
as sterile injectable aqueous or oleaginous suspensions or suppositories.

For oral administration as a suspension, the compositions can be prepared according to
techniques well-known in the art of pharmaceutical formulation. The compositions can contain
microcrystalline cellulose for imparting bulk, alginic acid or sodium alginate as a suspending
agent, methylcellulose as a viscosity enhancer, and sweeteners or flavoring agents. As immediate
release tablets, the compositions can contain microcrystalline cellulose, starch, magnesium
stearate and lactose or other excipients, binders, extenders, disintegrants, diluents and lubricants
known in the art.

For administration by inhalation or aerosol, the compositions can be prepared according to
techniques well-known in the art of pharmaceutical formulation. The compositions can be
prepared as solutions in saline, using benzyl alcohol or other suitable preservatives, absorption
promoters to enhance bioavailability, fluorocarbons or other solubilizing or dispersing agents
known in the art.

For administration as injectable solutions or suspensions, the compositions can be formulated
according to techniques well-known in the art, using suitable dispersing or wetting and
suspending agents, such as sterile oils, including synthetic mono- or diglycerides, and fatty acids,
including oleic acid.

For rectal administration as suppositories, the compositions can be prepared by mixing with a
suitable non-irritating excipient, such as cocoa butter, synthetic glyceride esters or polyethylene
glycols, which are solid at ambient temperatures, but liquefy or dissolve in the rectal cavity to
release the drug.

Preferred administration routes include orally, parenterally, as well as intravenous, intramuscular
or subcutaneous routes. More preferably, the compounds of the present invention are
administered parenterally, i.e., intravenously or intraperitoneally, by infusion or injection.

Solutions or suspensions of the compounds can be prepared in water, isotonic saline (PBS) and
optionally mixed with a nontoxic surfactant. Dispersions may also be prepared in glycerol, liquid
polyethylene glycols, DNA, vegetable oils, triacetin and mixtures thereof. Under ordinary
conditions of storage and use, these preparations may contain a preservative to prevent the growth
of microorganisms.

The pharmaceutical dosage form suitable for injection or infusion use can include sterile, aqueous
solutions or dispersions or sterile powders comprising an active ingredient which are adapted for
the extemporaneous preparation of sterile injectable or infusible solutions or dispersions. In all
cases, the ultimate dosage form should be sterile, fluid and stable under the conditions of
manufacture and storage. The liquid carrier or vehicle can be a solvent or liquid dispersion
medium comprising, for example, water, ethanol, a polyol such as glycerol, propylene glycol, or
liquid polyethylene glycols and the like, vegetable oils, nontoxic glyceryl esters, and suitable
mixtures thereof. The proper fluidity can be maintained, for example, by the formation of
liposomes, by the maintenance of the required particle size, in the case of dispersion, or by the
use of nontoxic surfactants. The prevention of the action of microorganisms can be accomplished
by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol,
sorbic acid, thimerosal, and the like. In many cases, it will be desirable to include isotonic
agents, for example, sugars, buffers, or sodium chloride. Prolonged absorption of the injectable
compositions can be brought about by the inclusion in the composition of agents delaying
absorption, for example, aluminum monostearate hydrogels and gelatin.

Sterile injectable solutions are prepared by incorporating the compounds in the required amount
in the appropriate solvent with various other ingredients as enumerated above and, as required,
followed by filter sterilization. In the case of sterile powders for the preparation of sterile
injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying
techniques, which yield a powder of the active ingredient plus any additional desired ingredient
present in the previously sterile-filtered solutions.

Industrial Applications

The AviIII polypeptides of the invention are effective cellulases. In the methods of the invention,
the cellulose degrading effects of AviIII are achieved by treating biomass at an AviIII:biomass
ratio of about 1 to about 50, or about 1:40, 1:35, 1:30, 1:25, 1:20, or even about 1:70 in some
preparations of the AviIII. AviIII may be used under extreme conditions, for example, elevated
temperatures and acidic pH. Treated biomass is degraded into simpler forms of carbohydrates,
and in some cases glucose, which is then used in the formation of ethanol or other industrial
chemicals, as is known in the art. Other methods are envisioned to be within the scope of the
present invention, including methods for treating fabrics to remove cellulose-containing stains
and other methods already discussed. AviIII polypeptides can be used in any known application
currently utilizing a cellulase, all of which are within the scope of the present invention.
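
As a worked illustration of the AviIII:biomass ratios listed above, the following Python sketch
converts a ratio into an enzyme loading. It assumes the ratios are expressed by mass, which the
text does not state explicitly; the function name and the 1 kg batch size are hypothetical.

```python
def enzyme_mass_for_biomass(biomass_g, ratio_denominator=50.0):
    """Mass of AviIII (in grams) needed for a 1:ratio_denominator
    AviIII:biomass ratio, ASSUMING the ratio is by mass."""
    if biomass_g < 0:
        raise ValueError("biomass mass must be non-negative")
    if ratio_denominator <= 0:
        raise ValueError("ratio denominator must be positive")
    return biomass_g / ratio_denominator


# Loadings for 1 kg of biomass at each ratio named in the text.
for denom in (20, 25, 30, 35, 40, 50, 70):
    grams = enzyme_mass_for_biomass(1000.0, denom)
    print(f"1:{denom} -> {grams:.1f} g AviIII per kg biomass")
```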

Having generally described the invention, the same will be more readily understood by reference
to the following examples, which are provided by way of illustration and are not intended as
limiting.

EXAMPLES 

Example 1: Molecular Cloning of AviIII

Genomic DNA was isolated from Acidothermus cellulolyticus and purified by banding on cesium
chloride gradients. Genomic DNA was partially digested with Sau 3A and separated on agarose
gels. DNA fragments in the range of 9-20 kilobase pairs were isolated from the gels. This
purified Sau 3A digested genomic DNA was ligated into the Bam HI acceptor site of purified
EMBL3 lambda phage arms (Clontech, San Diego, Calif.). Phage DNA was packaged according
to the manufacturer's specification and plated with E. coli LE392 in top agar which contained the
soluble cellulose analog, carboxymethylcellulose (CMC). The plates were incubated overnight
(12-24 hours) to allow transfection, bacterial growth, and plaque formation. Plates were stained
with Congo Red followed by destaining with 1 M NaCl. Lambda plaques harboring
endoglucanase clones showed up as unstained plaques on a red background.

Lambda clones which screened positive on CMC-Congo Red plates were purified by successive
rounds of picking, plating and screening. Individual phage isolates were named SL-1, SL-2, SL-3,
and SL-4. Subsequent subcloning efforts employed the SL-3 clone which contained an
approximately 14.2 kilobase fragment of Acidothermus cellulolyticus genomic DNA.

Template DNA was constructed using a 9 kilobase Bam HI fragment obtained from the 14.2
kilobase lambda clone SL-3 prepared from Acidothermus cellulolyticus genomic DNA. The 9
kilobase Bam HI fragment from SL-3 was subcloned into pDR540 to generate a plasmid
NREL501. NREL501 was sequenced by the primer walking method as is known in the art.
NREL501 was then subcloned into pUC19 using restriction enzymes Pst I and Eco RI and
transformed into E. coli XL1-blue (Stratagene) for the production of template DNA for
sequencing. Each subclone was sequenced from both the forward and reverse directions. DNA
for sequencing was prepared from an overnight growth in 500 mL LB broth using a megaprep
DNA purification kit from Promega. The template DNA was PEG precipitated and suspended
in de-ionized water and adjusted to a final concentration of 0.25 milligrams/mL.



Custom primers were designed by reading upstream known sequence and selecting segments of
an appropriate length to function, as is well known in the art. Primers for cycle sequencing were
synthesized at the Macromolecular Resources Facility located at Colorado State University in
Fort Collins, Colorado. Typically the sequencing primers were 26 to 30 nucleotides in length,
but were sometimes longer or shorter to accommodate a melting temperature appropriate for
cycle sequencing. The sequencing primers were diluted in de-ionized water, the concentration
measured using UV absorbance at 260 nm, and then adjusted to a final concentration of 5
pmol/microL.
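
The two primer calculations mentioned above, estimating a melting temperature and diluting a
stock to 5 pmol/microL, can be sketched as follows. The specification does not say how melting
temperatures were estimated; the GC-content approximation used here is a common rule of thumb
for oligonucleotides longer than about 13 nucleotides and is an assumption of this example, as
are the function names, the example primer, and the stock values.

```python
def basic_tm(primer):
    """Rough melting temperature (deg C) from GC content.

    Uses the common approximation Tm = 64.9 + 41 * (G + C - 16.4) / N,
    intended for primers longer than about 13 nt. This is an illustrative
    choice, not the method stated in the specification.
    """
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)


def water_to_add(stock_pmol_per_ul, stock_volume_ul, target_pmol_per_ul=5.0):
    """Volume of water (microL) to add so the stock reaches the target
    concentration, from C1 * V1 = C2 * V2."""
    if target_pmol_per_ul <= 0 or target_pmol_per_ul > stock_pmol_per_ul:
        raise ValueError("target must be positive and not exceed the stock")
    final_volume = stock_pmol_per_ul * stock_volume_ul / target_pmol_per_ul
    return final_volume - stock_volume_ul


# Hypothetical 26-mer, chosen only to exercise the functions.
EXAMPLE_PRIMER = "GCGTACCGGATCCAGGTCGACCTGCA"
print(len(EXAMPLE_PRIMER), round(basic_tm(EXAMPLE_PRIMER), 1))
print(water_to_add(20.0, 50.0))  # microL of water for a 20 pmol/microL stock
```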

Templates and sequencing primers were shipped to the Iowa State University DNA Sequencing
Facility at Ames, Iowa for sequencing using standard chemistries for cycle sequencing. In some
cases, regions of the template that sequenced poorly using the standard protocols and dye
terminators were repeated with the addition of 2 microL DMSO and by using nucleotides
optimized for the sequencing of high GC content DNA. An inverse PCR technique known in the
art was applied to continue sequencing the genomic DNA, and a primer walking method was
used to sequence the large PCR products. Each PCR fragment was sequenced from both strands,
using high fidelity commercial DNA polymerase.

Sequencing data from primer walking and subclones were assembled together to verify that all
SL-3 regions had been sequenced from both strands. An open reading frame (ORF) was found in
the 9 kilobase Bam HI fragment, C-terminal of E1 (U.S. Patent 5,536,655), termed AviIII. An
ORF of 3366 bp [SEQ ID NO:2] and deduced amino acid sequence [SEQ ID NO:1] are shown in
Tables 3 and 4. The amino acid sequence predicted by SEQ ID NO:1 was determined to have
significant homology to known cellulases, as is shown below in Example 2 and Table 5.
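
For readers unfamiliar with how an open reading frame is located in assembled sequence, the
following Python sketch scans the three forward reading frames for the longest start-to-stop
stretch. It is a simplified illustration, not the procedure used in this Example: it ignores the
reverse strand, the toy sequence is hypothetical, and the start codon set is a parameter because
high GC organisms may use alternatives such as GTG. The helper at the end converts an ORF
length to a residue count, assuming the stated length includes the stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_forward_orf(dna, start_codons=("ATG",)):
    """Return (start, end, length_bp) for the longest ORF on the forward
    strand, or None if there is none. `start` is 0-based, `end` is
    exclusive, and the length includes the stop codon."""
    seq = dna.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in start_codons:
                start = i  # open a candidate ORF at the first start codon
            elif start is not None and codon in STOP_CODONS:
                length = i + 3 - start
                if best is None or length > best[2]:
                    best = (start, i + 3, length)
                start = None  # close the ORF and keep scanning this frame
    return best


def residues_encoded(orf_length_bp):
    """Residues encoded by an ORF whose length includes the stop codon."""
    if orf_length_bp < 3 or orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    return orf_length_bp // 3 - 1


print(longest_forward_orf("CCATGAAATTTGGGTAACC"))  # toy sequence
print(residues_encoded(3366))  # codons in 3366 bp, minus one stop codon
```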



The amino acid sequence represents a novel member of the family of proteins with cellulase
activity. Due to the source of isolation, from the thermophilic Acidothermus cellulolyticus, AviIII
is a novel member of cellulases with properties including thermal tolerance. It is also known that
thermal tolerant enzymes may have other properties (see definition above).

Example 2: AviIII Includes a GH74 Catalytic Domain

Sequence alignments and comparisons of the amino acid sequences of the Acidothermus
cellulolyticus AviIII catalytic domain (approximately amino acids 37 to 776) and Aspergillus
aculeatus Avicelase III (endoglucanase) polypeptides were prepared, using the ClustalW program
(Thompson J.D. et al. (1994), Nucleic Acids Res. 22:4673-4680, from the EMBL European
Bioinformatics Institute website (http://www.ebi.ac.uk/)). An examination of the amino acid
sequence alignment of the GH74 domain indicates that the amino acid sequence of the AviIII
catalytic domain is homologous to the amino acid sequence of a known GH74 family catalytic
domain, that of Aspergillus aculeatus Avicelase III (endoglucanase) (see Table 5). In Table 5, the
notations are as follows: an asterisk "*" indicates identical or conserved residues in all sequences
in the alignment; a colon ":" indicates conserved substitutions; a period "." indicates semi-
conserved substitutions; and a hyphen "-" indicates a gap in the sequence. The amino acid
sequence predicted for the AviIII GH74 domain is approximately 46% identical to the
Aspergillus aculeatus Avicelase III (endoglucanase) GH74 domain, indicating that the AviIII
catalytic domain is a member of the GH74 family (Henrissat et al., (1991) supra).

Table 5. Multiple amino acid sequence alignment of an AviIII catalytic domain and
polypeptides with Glycoside Hydrolase Family 74 catalytic domains

Multalignment of related Glycoside Hydrolase Family 74 catalytic domains

GH74_Ace:   Acidothermus cellulolyticus AviIII catalytic domain GH74
AviIII_Aac: Aspergillus aculeatus Avicelase III (endoglucanase) GenBank Acc. # BAA20031

GH74_Ace    ATTQPYTwStJVA IGGGG - F jfDGl VFNEGAPG I LYVRTDI GGmYRWDAANGRW I FLLDWVG
AviIII_Aac  AA3(^YTWKbTVVTGGGGGF|?PGIVFNPSAKG^
            ~ wwww <t| ^ ww ww . w * ; . «*^-r*-w w* : . . * "l"**

GH74_Ace    WNNWGYJX^9IAADPINT|nCVWAAV^
AviIII_Aac  iroTWHDwGIDAl^TDPVDTpiT/YvAVG^

GH74_Ace    G*IMPG&^ERIAVPPWin;hlI/*P^^
AviIII_Aac  <^NFGR<MGS3U*vPPllK»|5ItYF*^

GH74_Ace    TTGYOSDIQOVVWAP^K^SSSLGQASm
AviIII_Aac  T--YTSOPVGIAWVTFPS1 3GSSa«ATPKItvGVADACKSVFIcaEPAGATWAWV3Cfi?Qy

GH74_Ace    GPI PHKGVFDPVKHVLYIA|r$mX3GPYIX3S^r)VWt^SVrSGTHTRI SFVPSTDTaNDYF
AviIII_Aac  GFLPHKGVLSPEEKTLYI W£ANt3AGPYpGTNGTVHKYNITSGVWTDI SP- - -TSIASTYY
            WW; WW***; .» ...WWW,' ;W « «** ». *. » J

GH74_Ace    GYSGLT JDRQHPDTIMVA^jai SWWPDT 1 1 ?R5TDCOATWTR.I WDWT^YPNRSLRYVLDI S
AviIII_Aac  GYGGl^VPLQVPGTX4<V7ulLNC^WPDEL:FHSTDSaATK5Prw

GH74_Ace    AG PWJ-TFGVQPfl PpVpS PJ-lLGWMDRAWA J DPfN 3DRML YGTGATLYATNOLTKHDSQGQ I
AviIII_Aac  NApWIQETTST DQPP- -V? VGWKVEALAI DPPDSNHWLYGTGLTVYGGHDLTNWDSXHNV

GH74_Ace    H I APMVJGGLEETAVNDL I \ PPSGAPLI SALGPIjCGFTHADnTTAVP ST I FTSPVFTTGTSV
AviIII_Aac  TVK3I-AVGrE£MAVT.GLI , (l PPGG PA£. L 5AVGDDGGF YH 8DLD AAPNQAYHT PTY GTTNG I

GH74_Ace    DYAELNPSI IVRAGSFDP. j SOPHDJ^VAFSTPGGKNWrQGSEPGGVrTGGTVAASAPGSft
AviIII_Aac  DYAGNKPSnIVRSGASDiHp TLALSSllFGSTWYAOYAASTSTGTGAVALSADGDT
            f« * <- w v

GH74_Ace    FV^!Ap<^PQQPvVVAVGP(pSWRASOGvPAlJAQI RSDRVNPKTFYAL3NGTFVRSTDGGV
AviIII_Aac  VT-LM8 STSQALrVSKSQG - | - TLTAVS 3LPSGAVXASDKSDJ3T VPYGG3AGAI YVSKNXAT
            ■ ; m j ; ; * w ww ; ; » * ; 5 » »,J ,,

GH74_Ace    TPQPVAAGLPSSGAVGVM ]-HAVpCKEODLlfUAASSGLY>lSTNGG3SWSAI -TGVSSAVHV
AviIII_Aac  SFTKUVS - JiSSSTTVXA I J : - AHP S I AGDVWASTDKGLWHSTD YGST FTQI GSGVTAGW3 P
            ; w * ** ■ " . ww,+ ":n * »**

GH74_Ace    Q PG KSA PG S SY PAVFWQ j ' I COVTGAY RS DDCGTTWVL I UDDQHQYQN - WGQA I TGDJiAN
AviIII_Aac  GFGKASSTGSYVV I YCFPl'I DGAAGI.PKSEDAGTNWQV I SOASHGPGSGSANWUGDLQT
            ""»;;. .** - i w .**. * i w . . i . i .**

GH74_Ace    LRKVYIGTNGp.GXVYGDlijsGAPSC
AviIII_Aac  YGRVFRGHfiRPGHLLRQ&bREPAG
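
The approximately 46% identity figure reported in Example 2 is the kind of value that can be
computed directly from a pairwise alignment. The Python sketch below shows one common
convention, counting identical residues over columns in which neither sequence has a gap. Other
denominators (full alignment length, or the shorter sequence) are also in use, and the
specification does not say which was applied; the function name and the toy sequences are
hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two rows of a pairwise alignment.

    Columns where either row has a gap are skipped, so the denominator is
    the number of residue-to-residue columns (one convention among several).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy rows: 8 ungapped columns, 7 of them identical.
print(percent_identity("ACDEFG-HIK", "ACDQFGLH-K"))
```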



Example 3: Mixed Domain GH74, CBD II, CBD III Genes and Hybrid Polypeptides

From the putative locations of the domains in the AviIII cellulase sequence given above and in
comparable cloned cellulase sequences from other species, one can separate individual domains
and combine them with one or more domains from different sequences. The significant similarity
between cellulase genes permits one by recombinant techniques to arrange one or more domains
from the Acidothermus cellulolyticus AviIII cellulase gene with one or more domains from a
cellulase gene from one or more other microorganisms. Other representative endoglucanase
genes include Bacillus polymyxa beta-(1,4) endoglucanase (Baird et al, Journal of Bacteriology,
172: 1576-86 (1992)) and Xanthomonas campestris beta-(1,4)-endoglucanase A (Gough et al,
Gene 89:53-59 (1990)). The result of the fusion of any two or more domains will, upon
expression, be a hybrid polypeptide. Such hybrid polypeptides can have one or more catalytic or
binding domains. For ease of manipulation, recombinant techniques may be employed such as the
addition of restriction enzyme sites by site-specific mutagenesis. If one is not using one domain
of a particular gene, any number of any type of change including complete deletion may be made
in the unused domain for convenience of manipulation.
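
The domain-shuffling idea can be illustrated at the sequence level with a short Python sketch.
It only models the bookkeeping: cutting domains out of parent polypeptide sequences by residue
coordinates and joining them, optionally through a linker. The class and function names, the toy
sequences, the coordinates, and the linker are all hypothetical; actual constructs would be
assembled at the gene level by the recombinant techniques described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    name: str
    start: int  # 1-based, inclusive residue position
    end: int    # 1-based, inclusive residue position


def extract(sequence, domain):
    """Return the residues of `domain` from a parent polypeptide sequence."""
    if not (1 <= domain.start <= domain.end <= len(sequence)):
        raise ValueError(f"domain {domain.name!r} lies outside the sequence")
    return sequence[domain.start - 1:domain.end]


def build_hybrid(parts, linker=""):
    """Join domains taken from different parents, N-terminus first.

    `parts` is a list of (parent_sequence, Domain) pairs.
    """
    return linker.join(extract(seq, dom) for seq, dom in parts)


# Toy parents; real coordinates would come from the domain annotation.
parent_a = "MAAAGHGHGHGHTTTCCC"
parent_b = "MKKWWWWSSS"
hybrid = build_hybrid(
    [(parent_a, Domain("catalytic", 5, 12)),
     (parent_b, Domain("binding", 4, 7))],
    linker="PTPT",
)
print(hybrid)
```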

It is understood for purposes of this disclosure, that various changes and modifications may be
made to the invention that are well within the scope of the invention. Numerous other changes
may be made which will readily suggest themselves to those skilled in the art and which are
encompassed in the spirit of the invention disclosed herein and as defined in the appended claims.

This specification contains numerous citations to references such as patents, patent applications,
and publications. Each is hereby incorporated by reference for all purposes.